
10 research outputs found

Nesting optimization with adversarial games, meta-learning, and deep equilibrium models

Author: Micaelli Paul
Publication venue: The University of Edinburgh
Publication date: 29/08/2023

Nested optimization, whereby an optimization problem is constrained by the solutions of other optimization problems, has recently seen a surge in its application to Deep Learning. While the study of such problems started nearly a century ago in the context of market theory, many of the algorithms developed since do not scale to modern Deep Learning applications. In this thesis, I push the understanding and applicability of nested optimization to three machine learning domains: 1) adversarial games, 2) meta-learning and 3) deep equilibrium models. For each domain, I tackle a particular goal. In 1) I adversarially learn model compression in the case where training data isn't available, in 2) I meta-learn hyperparameters for long optimization processes without introducing greediness, and in 3) I use deep equilibrium models to improve temporal coherence in video landmark detection. The first part of my thesis deals with casting model compression as an adversarial game. Performing knowledge transfer from a large teacher network to a smaller student is a popular task in deep learning. However, due to growing dataset sizes and stricter privacy regulations, it is increasingly common not to have access to the data that was used to train the teacher. I propose a novel method which trains a student to match the predictions of its teacher without using any data or metadata. This is achieved by nesting the training optimization of the student with that of an adversarial generator, which searches for images on which the student poorly matches the teacher. These images are used to train the student in an online fashion. The student closely approximates its teacher for simple datasets like SVHN, and on CIFAR-10 I improve on the state-of-the-art for few-shot distillation (with $100$ images per class), despite using no data. Finally, I also propose a metric to quantify the degree of belief matching between teacher and student in the vicinity of decision boundaries, and observe a significantly higher match between the zero-shot student and the teacher than between a student distilled with real data and the teacher. The second part of my thesis deals with meta-learning hyperparameters in the case when the nested optimization to be differentiated is itself solved by many gradient steps. Gradient-based hyperparameter optimization has earned widespread popularity in the context of few-shot meta-learning, but remains broadly impractical for tasks with long horizons (many gradient steps), due to memory scaling and gradient degradation issues. A common workaround is to learn hyperparameters online, but this introduces greediness, which comes with a significant performance drop. I propose forward-mode differentiation with sharing (FDS), a simple and efficient algorithm which tackles memory scaling issues with forward-mode differentiation, and gradient degradation issues by sharing hyperparameters that are contiguous in time. I provide theoretical guarantees about the noise reduction properties of my algorithm, and demonstrate its efficiency empirically by differentiating through $\sim 10^4$ gradient steps of unrolled optimization. I consider large hyperparameter search ranges on CIFAR-10 where I significantly outperform greedy gradient-based alternatives, while achieving $\times 20$ speedups compared to the state-of-the-art black-box methods. The third part of my thesis deals with converting deep equilibrium models to a form of nested optimization in order to perform robust video landmark detection. Cascaded computation, whereby predictions are recurrently refined over several stages, has been a persistent theme throughout the development of landmark detection models. I show that the recently proposed deep equilibrium model (DEQ) can be naturally adapted to this form of computation, given appropriate regularization. My landmark model achieves state-of-the-art performance on the challenging WFLW facial landmark dataset, reaching $3.92$ normalized mean error with fewer parameters and a training memory cost of $\mathcal{O}(1)$ in the number of recurrent modules. Furthermore, I show that DEQs are particularly suited for landmark detection in videos. In this setting, it is typical to train on still images due to the lack of labeled videos. This can lead to a "flickering" effect at inference time on video, whereby a model can rapidly oscillate between different plausible solutions across consecutive frames. I show that the DEQ root solving problem can be turned into a constrained optimization problem in a way that emulates recurrence at inference time, despite not having access to temporal data at training time. I call this "Recurrence without Recurrence", and demonstrate that it helps reduce landmark flicker by introducing a new metric, and contributing a new facial landmark video dataset targeting landmark uncertainty. On the hard subset of this new dataset, made up of $500$ videos, my model improves the accuracy and temporal coherence by $10$ and $13\%$ respectively, compared to the strongest previously published model using a hand-tuned conventional filter.
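The forward-mode idea described in the second part of the abstract above can be illustrated with a toy sketch. This is not code from the thesis: the one-dimensional quadratic loss, the function name `hypergradient_forward_mode`, and the parameters `a`, `lrs` and `steps_per_lr` are all assumptions chosen for illustration. Each learning rate is shared across a block of contiguous gradient steps, and the derivative of the weight with respect to each shared learning rate is carried forward alongside the weight, so memory does not grow with the number of unrolled steps.

```python
def hypergradient_forward_mode(w0, a, lrs, steps_per_lr):
    """Toy forward-mode hypergradient on the loss 0.5 * a * w**2.

    Hypothetical setup: lrs[k] is a learning rate shared by a block of
    `steps_per_lr` contiguous gradient-descent steps. Returns the final
    validation loss 0.5 * w_T**2 and its gradient with respect to lrs.
    """
    w = w0
    # z[j] = d w / d lrs[j]; updated in place of storing the whole trajectory.
    z = [0.0] * len(lrs)
    for k, lr in enumerate(lrs):
        for _ in range(steps_per_lr):
            grad = a * w  # gradient of the training loss at the current w
            # Differentiate the update w_next = w - lr * grad w.r.t. each lrs[j]:
            # d w_next / d lrs[j] = (1 - lr * a) * z[j] - grad * [j == k]
            z = [(1.0 - lr * a) * zj - (grad if j == k else 0.0)
                 for j, zj in enumerate(z)]
            w = w - lr * grad
    val_loss = 0.5 * w * w
    # Chain rule: d val_loss / d lrs[j] = w_T * d w_T / d lrs[j]
    return val_loss, [w * zj for zj in z]


if __name__ == "__main__":
    loss, hypergrads = hypergradient_forward_mode(2.0, 1.5, [0.1, 0.05], 5)
    print(loss, hypergrads)
```

The tangent list `z` has one entry per shared learning rate, so its size depends on the number of hyperparameters rather than on the horizon length. That is the memory property the abstract attributes to forward-mode differentiation.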

Edinburgh Research Archive

Meta-Learning in Neural Networks: A Survey

Author: Antoniou Antreas
Hospedales Timothy M
Micaelli Paul
Storkey Amos J.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 07/11/2020

The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent years. Contrary to conventional approaches to AI where tasks are solved from scratch using a fixed learning algorithm, meta-learning aims to improve the learning algorithm itself, given the experience of multiple learning episodes. This paradigm provides an opportunity to tackle many conventional challenges of deep learning, including data and computation bottlenecks, as well as generalization. This survey describes the contemporary meta-learning landscape. We first discuss definitions of meta-learning and position it with respect to related fields, such as transfer learning and hyperparameter optimization. We then propose a new taxonomy that provides a more comprehensive breakdown of the space of meta-learning methods today. We survey promising applications and successes of meta-learning such as few-shot learning and reinforcement learning. Finally, we discuss outstanding challenges and promising areas for future research.

arXiv.org e-Print Archive

Edinburgh Research Explorer

Dynamic control of DHM for ergonomic assessments

Author: Andriot Claude
de Magistris Giovanni
Evrard Paul
Gaudez Clarisse
Marsot Jacques
Micaelli Alain
Savin Jonathan
Publication venue: Elsevier BV
Publication date: 31/01/2013

Physical risk factors assessment is usually conducted by analysing postures and forces implemented by the operator during a work-task performance. A basic analysis can rely on questionnaires and video analysis, but more accurate comprehensive analysis generally requires complex expensive instrumentation, which may hamper movement task performance. In recent years, it has become possible to study the ergonomic aspects of a workstation from the initial design process, by using digital human model (DHM) software packages such as Pro/ENGINEER Manikin, JACK, RAMSIS or CATIA-DELMIA Human. However, a number of limitations concerning the use of DHM have been identified, for example biomechanical approximations, static calculation, description of the probable future situation or statistical data on human performance characteristics. Furthermore, the most common DHM used in the design process are controlled through inverse kinematic techniques, which may not be suitable for all situations to be simulated. A dynamic DHM automatically controlled in force and acceleration would therefore be an important contribution to analysing ergonomic aspects, especially when it comes to movement, applied forces and joint torques evaluation. Such a DHM would fill the gap between measurements made on the operator performing the task and simulations made using a static DHM. In this paper, we introduce the principles of a new autonomous dynamic DHM, then describe an application and validation case based on an industrial assembly task adapted and implemented in the laboratory. An ergonomic assessment of both the real task and the simulation was conducted based on analysing the operator/manikin's joint angles and applied force in accordance with machinery safety standards (Standard NF EN ISO 1005-1 to 5 and Occupational Repetitive Actions (OCRA) index). Given minimum description parameters of the task and subject, our DHM provides a simulation whose ergonomic assessment agrees with experimental evaluation.

HAL-CEA

Présentation d'un système de contrôle biomimétique d'humains virtuels par apprentissage [Presentation of a learning-based biomimetic control system for virtual humans]

Author: De Magistris Giovanni
Evrard Paul
Micaelli Alain
Savin Jonathan
Publication venue: John Wiley & Sons
Publication date: 10/03/2015

This paper presents a new learning control framework for digital human models in a physics-based virtual environment. The novelty of our controller is that it combines multi-objective control based on human properties (combined feedforward and feedback controller) with a learning technique based on human learning properties (the human ability to learn novel task dynamics through the minimization of instability, error and effort). This controller performs multiple tasks simultaneously (balance, non-sliding contacts, manipulation) in real time and adapts feedforward force as well as impedance to counter environmental disturbances. It is very useful for dealing with unstable manipulations, such as tool-use tasks, and for compensating for perturbations. An interesting property of our controller is that it is implemented in Cartesian space with joint stiffness, damping and torque learning in a multi-objective control framework. The relevance of the proposed control method to model human motor adaptation has been demonstrated by various simulations.

HAL-CEA

Temporalités et autobiographie [Temporalities and autobiography]

Author: Abella Adela
Baudouin Jean-Michel
Bors Edit
Bouffartigue Paul
Dubar Claude
Eraly Hélène
LAPORTE Cyrille
Leclerc Natalia
Melchior Jean-Philippe
Micaelli Isabelle
Pita Juan Carlos
Poblete Lorena
Samuel Olivia
Strasser Anne
Publication venue: Temporalités

OpenEdition

Projet MIPS : maitriser les investissements productiques strategiques [MIPS project: mastering strategic investments in computer-integrated manufacturing]

Author: ADDEO 33 - Bordeaux (France)
Blanchard Michel
Centre National de la Recherche Scientifique (CNRS) 69 - Lyon (France). Economie des Changements Technologiques
Cornut Alain
Fouet Jean-Marc
Glodas Henri
Jacot Jacques-Henri
Lyon-1 Univ. 69 - Villeurbanne (France). Lab. d'Ingenierie des Systemes d'Information (LISI)
Lyon-2 Univ. 69 (France). Economie des Changements Technologiques
Maurin Bernard
Micaelli Jean-Pierre
Peyrondet Jacques
Pole Productique Rhone Alpes (PPRA) 42 - Saint-Etienne (France)
Productica 33 - Talence (France)
Romand Paul
SARL Paul Romand Innovation (SPRI) 26 - Valence (France)
Publication date: 01/01/1994

SIGLE. Available from INIST (FR), Document Supply Service, under shelf-number: AR 16103 / INIST-CNRS - Institut de l'Information Scientifique et Technique. Ministere de l'Enseignement Superieur et de la Recherche, 75 - Paris (France). FR. Franc

OpenGrey Repository

A human-like learning control for digital human models in a physics-based virtual environment

Author: A Miller
A Smith
Alain Micaelli
C Cheah
C Yang
D Chaffin
D Franklin
D Wolpert
E Burdet
E Castillo
E Occhipinti
E Perreault
E Todorov
Giovanni De Magistris
H Gomi
J Buchli
J Lackner
J Slotine
J Won
JM Porter
Jonathan Savin
K Astrom
K Schaub
K Tee
L Sciavicco
M Kawato
N Andreiev
N Hogan
N Prakash
O Khatib
P Fitts
P Gribble
P Morasso
Paul Evrard
R Hyman
R Kirsch
R Shadmehr
S Haddadin
S Jagannathan
T Bretl
T Flash
T Milner
Y Uno
Z Bien
Publication venue: Springer Science and Business Media LLC

Crossref